DOCUMENT RESUME



ED 137 345 



TB 006 1«6 



AUTHOR          Kolen, Michael J.; And Others
TITLE           Methods of Smoothing Double-Entry Expectancy Tables
                Applied to the Prediction of Success in College.
                Research Report No. 91.
INSTITUTION     Iowa Univ., Iowa City. Evaluation and Examination
                Service.
PUB DATE        Mar 77
NOTE            27p.; Paper presented at the Annual Meeting of the
                American Educational Research Association (61st, New
                York, New York, April 4-8, 1977)

EDRS PRICE      MF-$0.83 HC-$2.06 Plus Postage.
DESCRIPTORS     *College Entrance Examinations; College Freshmen;
                Expectancy Tables; *Grade Point Average; Higher
                Education; Multiple Regression Analysis; Prediction;
                Predictor Variables; Probability; Secondary
                Education; *Statistical Analysis; Success Factors
IDENTIFIERS     ACT Assessment Program; *Smoothing Methods



ABSTRACT

Six methods for smoothing double-entry expectancy tables (tables that
relate two predictor variables to probability of attaining a selected
level of success on a criterion) were compared using data for entering
students at 85 colleges and universities. ACT composite scores and
self-reported high school grade averages were used to construct
expectancy tables based on data for students entering each institution in
1969-70. Tables were constructed using two levels of success: "C or
better" and "B or better" first semester grade point averages. The tables
were smoothed using each method and evaluated according to how closely
the smoothed tables corresponded to observed data at the same
institutions in 1971-72 and in 1972-73. The smoothed tables were more
accurate than those based on 1969-70 observed relative frequencies. A
linear regression of observed relative frequency on predictor value was
most accurate; two extensions of an isotonic regression method were
nearly as accurate. A commonly used regression method was found to be
less accurate than most other methods. (Author)






METHODS OF SMOOTHING DOUBLE-ENTRY EXPECTANCY TABLES
APPLIED TO THE PREDICTION OF SUCCESS IN COLLEGE



Michael J. Kolen
William M. Smith
Douglas R. Whitney



Research Report No. 91



Presented at the Annual Meeting
American Educational Research Association
New York, New York
April, 1977



Evaluation and Examination Service
The University of Iowa
Iowa City, Iowa

March, 1977



METHODS OF SMOOTHING DOUBLE-ENTRY EXPECTANCY TABLES
APPLIED TO THE PREDICTION OF SUCCESS IN COLLEGE



Michael J. Kolen, William M. Smith and Douglas R. Whitney

University of Iowa

Six methods for smoothing double-entry expectancy tables (tables that
relate two predictor variables to probability of attaining a selected level
of success on a criterion) were compared using data for entering students
at 85 colleges and universities. ACT composite scores and self-reported
high school grade averages were used to construct expectancy tables based
on data for students entering each institution in 1969-70. Tables were
constructed using two levels of success: "C or better" and "B or better"
first semester grade point averages. The tables were smoothed using each
method and evaluated according to how closely the smoothed tables
corresponded to observed data at the same institutions in 1971-72 and in
1972-73. The smoothed tables were more accurate than those based on
1969-70 observed relative frequencies. A linear regression of observed
relative frequency on predictor value was most accurate; two extensions of
an isotonic regression method were nearly as accurate. A commonly used
regression method was found to be less accurate than most other methods.



METHODS OF SMOOTHING DOUBLE-ENTRY EXPECTANCY TABLES
APPLIED TO THE PREDICTION OF SUCCESS IN COLLEGE

The probability that an individual will attain a certain level of
success is often the measure of primary interest in guidance and placement
decisions and in the selection of qualified applicants. Expectancy tables
display the likelihood of success on some criterion measure given various
levels of performance on one or more predictors related to success.

One factor that detracts from the usefulness of expectancy tables is
that chance irregularities can occur when the tables are constructed
directly from observed data, especially when relative frequencies are
computed for categories containing few observations (Anastasi, 1976).
These can make it appear that the probability of success is smaller for
high ability students than for lower ability students when theory and
experience indicate otherwise. When such apparent contradictions occur, it
is desirable to "smooth" the expectancy table to remove the logical
inconsistencies. Perrin & Whitney (1976) studied a number of methods for
smoothing single-entry expectancy tables and concluded that the gains in
accuracy which resulted warranted their use for tables used in college
admissions, guidance, or planning purposes.

When two predictors are available, the probability of attaining a chosen
level of success on the criterion can be displayed in a double-entry
expectancy table (Wesman, 1966). Score intervals on the two predictors are
listed at the margins of such double-entry expectancy tables. The values in
the body of the table estimate the probability of achieving success given
the row and column values on the predictors. These probabilities can be
estimated directly from the observed relative frequencies or by the use of
various smoothing techniques.






The concern of this paper is with situations in which it can be presumed
that the two predictor variables have a monotonic nondecreasing relationship
with probability of success and with each other. The problem examined here
arose in the process of constructing expectancy tables using achievement
test scores and high school percentile rank to estimate probabilities of
attaining certain grade point averages. A number of the estimated
probabilities in the tables constructed using the observed relative
frequencies contradicted our beliefs about the nature of the underlying
relationship among the variables. That is, the assumption of monotonicity
seems reasonable for tables reflecting test scores and high school grades
used to estimate the probability of attaining certain grade average levels
in college.

American College Test (ACT) composite scores and self-reported high school
grade average (HSA) were used to estimate the proportion of students
evidencing two levels of success in college. The two levels were earning a
first term grade point average (GPA) of "C or better" and "B or better".
Double-entry expectancy tables were constructed directly from the observed
relative frequencies and also by applying various smoothing techniques to
these relative frequencies. Two indexes reflecting the similarity between
cross-validation year relative frequencies and estimated relative
frequencies were used to evaluate the smoothing methods.[1]

[1] The use of base year data to prepare tables for use with students in
subsequent years suggests that, for each institution, joint probabilities
exist and are identical from year to year. The degree to which each
method's estimate resembles these parameters will be studied in another
paper.

Construction Methods

Method 1: Observed Relative Frequencies

The most common method for constructing double-entry expectancy tables
is to report the observed relative frequency of success at each combination
of the two predictor variables. For example, if 30% of the individuals with
ACT scores between 20 and 25 and HSA between 2.6 and 3.3 were successful in
a particular year, a value of .30 would be placed in the appropriate cell of
the table. Thus, the observed relative frequencies from one year are used
to predict the relative frequencies in subsequent years.
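The tabulation behind Method 1 can be sketched as follows. This is a minimal illustration, assuming NumPy and assuming each student has already been assigned an interval index on ACT and on HSA; the function name `observed_table`, the row/column orientation, and the ten students in the usage lines are hypothetical and not taken from the study.

```python
import numpy as np

def observed_table(act_bin, hsa_bin, success, n_bins=5):
    """Method 1 sketch: cell counts and observed relative frequencies.

    act_bin, hsa_bin: integer interval indices (0 .. n_bins - 1) per student.
    success: 1 if the student reached the chosen GPA level, else 0.
    Cells with no students are returned as NaN (an assumption of this sketch).
    """
    act_bin = np.asarray(act_bin)
    hsa_bin = np.asarray(hsa_bin)
    success = np.asarray(success, dtype=float)
    counts = np.zeros((n_bins, n_bins))
    hits = np.zeros((n_bins, n_bins))
    # Rows index HSA intervals, columns index ACT intervals.
    np.add.at(counts, (hsa_bin, act_bin), 1.0)
    np.add.at(hits, (hsa_bin, act_bin), success)
    rel = np.full((n_bins, n_bins), np.nan)
    filled = counts > 0
    rel[filled] = hits[filled] / counts[filled]
    return counts, rel

# Ten hypothetical students in one cell, three of them successful.
counts, rel = observed_table([2] * 10, [3] * 10, [1, 1, 1] + [0] * 7)
```

With three successes among ten students, the cell entry is .30, as in the example in the text.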

Method 2a: Linear Regression on GPA

The original data were used to estimate the intercept, regression weights,
and standard error of estimate for a multiple linear regression model using
GPA as the criterion variable. The entry for each cell in the expectancy
table was estimated by substituting the center[2] of each ACT and HSA
interval into the regression equation and calculating the standard normal
deviate corresponding to the difference between the predicted GPA and the
selected level of success (that is, the difference divided by the standard
error of estimate). The cell entry is then the cumulative distribution
function at the normal deviate value. This method assumes a planar
relationship among GPA, HSA, and ACT and normal, homoscedastic conditional
distributions.
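A single cell entry under Method 2a might be computed as in the sketch below. It uses only the Python standard library; the function names and all numeric values in the usage lines (intercept, weights, standard error of estimate, cutoff) are hypothetical placeholders, not estimates from the study.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Cumulative distribution function of the unit normal curve."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def method_2a_cell(act_center, hsa_center, b0, b_act, b_hsa, see, cutoff):
    """Estimated probability that GPA reaches `cutoff` for one cell.

    b0, b_act, b_hsa: intercept and regression weights for predicting GPA.
    see: standard error of estimate of that regression.
    """
    predicted_gpa = b0 + b_act * act_center + b_hsa * hsa_center
    z = (predicted_gpa - cutoff) / see   # standard normal deviate
    return normal_cdf(z)

# Hypothetical weights: a cell whose predicted GPA equals the cutoff.
p_even = method_2a_cell(20.0, 2.0, 0.0, 0.05, 0.5, 0.6, cutoff=2.0)
# Same cell with a predicted GPA one standard error above the cutoff.
p_high = method_2a_cell(20.0, 2.0, 0.6, 0.05, 0.5, 0.6, cutoff=2.0)
```

When the predicted GPA equals the selected level of success the entry is .50; the entry rises toward 1 as the predicted GPA moves above the cutoff in units of the standard error of estimate.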
Method 2b: Curvilinear Regression on GPA

Method 2a was amended to include the square of each predictor variable
and the cross-product term in the equation for predicting GPA. This method
allowed for a wider range of possible relationships among GPA, ACT, and HSA
than did Method 2a. Otherwise, it was identical to Method 2a.



[2] The center of each interval was defined as the mean of the interval
under the assumption that the marginal distributions of ACT and HSA were
normal. The mean of the interval in unit normal deviates from the overall
mean is

\[ \mu_{\text{interval}} = \frac{y_l - y_h}{p}, \]

where y_l and y_h are the ordinates of the unit normal curve
corresponding, respectively, to the low and high endpoints of the interval,
and p is the area under the unit normal curve that lies between the
endpoints of the interval.
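The interval center described in footnote 2 can be sketched with the standard library alone. The function names are hypothetical, and the conversion back to the score scale (mean plus standard deviation times the deviate) is an assumption of this sketch rather than a step stated in the text.

```python
from math import erf, exp, inf, pi, sqrt

def ordinate(z):
    """Height of the unit normal curve at z (0.0 at plus or minus infinity)."""
    return exp(-0.5 * z * z) / sqrt(2.0 * pi)

def area_below(z):
    """Area under the unit normal curve to the left of z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def interval_mean_z(lo, hi):
    """Mean of a unit normal variable restricted to the interval (lo, hi)."""
    p = area_below(hi) - area_below(lo)
    return (ordinate(lo) - ordinate(hi)) / p

def interval_center(lo, hi, mean, sd):
    """Interval center on the score scale; lo and hi are raw-score endpoints."""
    return mean + sd * interval_mean_z((lo - mean) / sd, (hi - mean) / sd)

# Open-ended top ACT interval (27.5 or above), using mean 20 and SD 5.
top_center = interval_center(27.5, inf, 20.0, 5.0)
```

For open-ended intervals the outer endpoint is taken as infinite, so its ordinate is zero and the center falls inside the tail rather than at an arbitrary midpoint.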



Method 3a: Linear Regression on Observed Relative Frequencies

The relative frequencies in the unsmoothed table were regressed on ACT
and HSA (each relative frequency was weighted by the number of cases in
that cell). The centers of the intervals were again used as the values for
ACT and HSA in the computations. The estimated intercept and regression
weights were used to generate the smoothed expectancy table. Any predicted
relative frequencies less than zero or greater than one were set equal to
the appropriate limit. This method assumes that a planar relationship
exists among probability of success, HSA, and ACT scores.
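A weighted least-squares fit of this kind can be sketched as below, assuming NumPy. The function name `method_3a`, the table orientation (rows for HSA, columns for ACT), and the decision to leave empty cells out of the fit are assumptions of this sketch; the usage data are made up so that the plane is recovered exactly.

```python
import numpy as np

def method_3a(rel, counts, act_centers, hsa_centers):
    """Fit a plane to observed relative frequencies, weighting by cell size.

    rel, counts: 2-D arrays with rows indexing HSA and columns indexing ACT.
    Returns the smoothed table (clipped to [0, 1]) and the three weights
    (intercept, ACT weight, HSA weight).
    """
    rel = np.asarray(rel, dtype=float)
    counts = np.asarray(counts, dtype=float)
    hsa_grid, act_grid = np.meshgrid(hsa_centers, act_centers, indexing="ij")
    used = counts > 0                      # empty cells carry no weight
    X = np.column_stack([np.ones(used.sum()), act_grid[used], hsa_grid[used]])
    w = np.sqrt(counts[used])              # sqrt weights give count-weighted LS
    coef, *_ = np.linalg.lstsq(X * w[:, None], rel[used] * w, rcond=None)
    fitted = coef[0] + coef[1] * act_grid + coef[2] * hsa_grid
    return np.clip(fitted, 0.0, 1.0), coef

# Hypothetical 3 x 3 table that lies exactly on a plane.
act_c = np.array([10.0, 15.0, 20.0])
hsa_c = np.array([1.0, 2.0, 3.0])
H, A = np.meshgrid(hsa_c, act_c, indexing="ij")
rel_demo = 0.02 * A + 0.1 * H
smoothed, weights = method_3a(rel_demo, np.full(rel_demo.shape, 10.0), act_c, hsa_c)
```

Clipping is applied after prediction, matching the rule that values below zero or above one are set to the nearest limit.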
Method 3b: Curvilinear Regression on Observed Relative Frequencies

Method 3a was amended to include the square of each predictor variable
and the cross-product term in the equation for predicting the relative
frequencies of success. This method (like 2b) allowed for a wider range of
relationships among probability of success, ACT, and HSA.
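The extra terms used by the curvilinear methods (2b and 3b) amount to a wider set of predictor columns. A minimal sketch, assuming NumPy; the function name `quadratic_design` and the column order are hypothetical.

```python
import numpy as np

def quadratic_design(act, hsa):
    """Predictor columns for the curvilinear models.

    Columns: constant, ACT, HSA, ACT squared, HSA squared, ACT x HSA.
    """
    act = np.asarray(act, dtype=float)
    hsa = np.asarray(hsa, dtype=float)
    return np.column_stack(
        [np.ones_like(act), act, hsa, act ** 2, hsa ** 2, act * hsa]
    )

# Two hypothetical interval centers.
X_demo = quadratic_design([20.0, 25.0], [2.6, 3.3])
```

The same least-squares machinery used for the planar models can then be applied to this six-column matrix.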
Method 4: Extensions of the Isotonic Regression

The isotonic regression method (Ayer, Brunk, Ewing, Reid, and Silverman,
1955), which assumes only that the relationship between probability of
success and the predictor variable is non-decreasing monotonic, is very
straightforward in single-entry tables. After forming the expectancy table,
the table is examined for reversals. Each reversal, where relatively fewer
students with higher predictor values achieved success, is considered a
chance reversal because it violates the assumption that the relationship
between predictor and criterion is non-decreasing monotonic. When such a
reversal is encountered, the two (or more) relative frequencies involved in
the reversal are weighted by the number of observations in each of the
cells and averaged. This average replaces the observed relative frequencies
in each cell involved in the reversal. The process continues until there
are no reversals remaining.
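The single-entry procedure just described can be sketched in plain Python as a pooling of adjacent cells. The function name is hypothetical. The relative frequencies in the usage lines (.300, .200, .200, .700) are the second-row values quoted in the discussion of Figure 1; the frequencies 10, 80, 5, and 20 are assumed values, chosen because they reproduce the pooled value .211 and the averaged frequency 31.7 reported there.

```python
def smooth_single_entry(rel, n):
    """Pool adjacent cells until relative frequencies never decrease.

    rel: relative frequencies ordered from lowest to highest predictor level.
    n:   number of observations in each cell.
    """
    blocks = []  # each block: [pooled relative frequency, total n, cells]
    for r, k in zip(rel, n):
        blocks.append([float(r), float(k), 1])
        # A reversal exists while the previous block exceeds the newest one.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            r2, k2, c2 = blocks.pop()
            r1, k1, c1 = blocks.pop()
            total = k1 + k2
            if total > 0:
                pooled = (r1 * k1 + r2 * k2) / total
            else:
                pooled = 0.5 * (r1 + r2)  # assumption: unweighted if both empty
            blocks.append([pooled, total, c1 + c2])
    smoothed = []
    for r, _, c in blocks:
        smoothed.extend([r] * c)
    return smoothed

row = smooth_single_entry([0.300, 0.200, 0.200, 0.700], [10, 80, 5, 20])
```

The first three cells pool to 20/95, or about .211, and the last cell is left at .700 because it is not involved in a reversal.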



The computations involved in extending this method to double-entry
tables, while still preserving its mathematical qualities discussed by Ayer
et al. (1955), appear to be intractable. The extensions reported here are
attempts to extend the logic (though they do not necessarily retain the
mathematical properties) of the single-entry method to smooth double-entry
tables.

In the two extensions of this method, the only assumptions about the
form of the relationship among ACT, HSA, and probability of success are that
for any given ACT level, the relationship between HSA and the probability of
success is non-decreasing monotonic and that for any given HSA level, the
relationship between ACT and probability of success is non-decreasing
monotonic (i.e., the conditional distributions of each predictor and
criterion given the other predictor are non-decreasing monotonic). These
assumptions do not, for example, indicate whether or not individuals with an
ACT score of 25 and a HSA of 2.7 should have as high an estimated
probability of success as individuals with an ACT score of 24 and a HSA of
3.0. The extensions to double-entry tables involved two methods of resolving
this problem; other defensible (and possibly better) extensions exist.

Method 4a: Alternately Treating Rows and Columns as Single-Entry Tables

In the first phase, each row in the table was smoothed (if necessary)
by the method described above for single-entry tables. After all rows
had been adjusted, the table was examined column by column to determine if
any reversals remained. Any reversals in the columns were adjusted by the
same procedure that was applied to the rows. The second phase involved
returning to the original table and smoothing first by columns and then by
rows. Each of the two phases produced usable, although often different,
solutions. Since there was no reason to prefer one solution over the other,
the cell entries in the two resulting tables were averaged and examined to
determine if any reversals existed. If reversals still remained, the entire
procedure was repeated on the averaged table as often as necessary.[3] Cells
that contained zero observations in the original data were treated as zero
frequency and zero relative frequency. The cell frequencies involved in
reversals were also averaged because, without this procedure, the solution
would often not converge or would converge very slowly.

Figure 1 illustrates this smoothing method applied to a 3 x 4 expectancy
table. In the second row in Figure 1b, the values .300, .200, and .200 in
the unsmoothed table represent a reversal. These values were averaged
(weighted by cell frequencies) to obtain the value .211. Also note that the
cell frequencies were averaged to produce the new cell frequency values of
31.7. The resulting table (Figure 1f) was obtained by averaging the cell
entries and frequencies in the tables in Figure 1c and Figure 1e. Since no
reversals were present in the resulting expectancy table, no further
smoothing was required.
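The two-phase procedure of Method 4a can be sketched as follows, assuming NumPy. All function names are hypothetical, both axes are assumed to be ordered from the lowest to the highest interval, the small tolerance and the `max_iter` guard are additions of this sketch, and the 2 x 2 table in the usage lines is made up for illustration.

```python
import numpy as np

TOL = 1e-12  # assumption: differences this small are not treated as reversals

def pool_reversals(rel, n):
    """Single-entry smoothing of one row or column; also averages frequencies."""
    blocks = []  # [pooled relative frequency, total frequency, cells pooled]
    for r, k in zip(rel, n):
        blocks.append([float(r), float(k), 1])
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0] + TOL:
            r2, k2, c2 = blocks.pop()
            r1, k1, c1 = blocks.pop()
            total = k1 + k2
            pooled = (r1 * k1 + r2 * k2) / total if total > 0 else 0.5 * (r1 + r2)
            blocks.append([pooled, total, c1 + c2])
    out_r, out_n = [], []
    for r, k, c in blocks:
        out_r += [r] * c
        out_n += [k / c] * c      # pooled cells share the averaged frequency
    return np.array(out_r), np.array(out_n)

def smooth_lines(rel, n, by_rows):
    """Smooth every row (by_rows=True) or every column (by_rows=False)."""
    rel, n = rel.copy(), n.copy()
    if not by_rows:
        rel, n = rel.T, n.T       # transposed views of the copies
    for i in range(rel.shape[0]):
        rel[i], n[i] = pool_reversals(rel[i], n[i])
    return (rel, n) if by_rows else (rel.T, n.T)

def is_monotone(rel):
    """True when no reversal remains along either axis."""
    return bool(np.all(np.diff(rel, axis=0) >= -TOL)
                and np.all(np.diff(rel, axis=1) >= -TOL))

def method_4a(rel, n, max_iter=100):
    """Average the rows-then-columns and columns-then-rows solutions."""
    rel = np.asarray(rel, dtype=float)
    n = np.asarray(n, dtype=float)   # empty cells: frequency 0, rel. freq. 0
    for _ in range(max_iter):
        if is_monotone(rel):
            break
        ra, na = smooth_lines(*smooth_lines(rel, n, True), False)
        rb, nb = smooth_lines(*smooth_lines(rel, n, False), True)
        rel, n = (ra + rb) / 2.0, (na + nb) / 2.0
    return rel, n

table_4a, freq_4a = method_4a([[0.5, 0.4], [0.6, 0.7]], [[10, 10], [10, 10]])
```

In the made-up table only the first row contains a reversal, so both phases give the same answer and one pass suffices; tables where the two phases disagree are averaged and re-examined, as the text describes.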

Insert Figure 1 about here 



[3] Of the 170 tables smoothed in this study, 152 (89%) required no
repetitions and the remaining 18 tables required only one repetition of the
procedure. Thus, this procedure appears to converge rapidly.

Method 4b: Linear Regression Weights Used to Provide a Single Dimensional Ordering

If the cells of the table could be ordered in such a way that one would
know, for example, that individuals with an ACT score of 25 and a HSA of 2.7
have at least as great an estimated probability of success as individuals
with an ACT score of 24 and a HSA of 3.0, then the single-entry method
could be employed. To approximate this information, the regression weights
from Method 3a were used to provide an unequivocal ordering of the cells.
If the predicted value for one cell was greater than the predicted value
for another cell, then the probability of success for individuals with
scores corresponding to the first cell was assumed to be at least as great
as the probability of success for individuals with scores corresponding to
the second. In this way a single-entry table was constructed. The
single-entry smoothing method was then applied and the table reconstructed.
This solution avoided some of the problems encountered in Method 4a.

An example of this method is shown in Figure 2. Based on the regression
weights computed from the data shown in Figure 1a, the cells were ordered
as in Figure 2a, smoothed using the single-entry method in Figure 2b, and
reconstructed in Figure 2c. As would be expected, Methods 4a and 4b
produced different results, although the results were usually similar for
cells that initially contained a relatively large number of observations.
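The ordering step of Method 4b can be sketched as below, assuming NumPy. The function names are hypothetical, the predicted values would come from a planar fit such as Method 3a, and breaking ties in predicted value by original cell position is an assumption of this sketch. The 2 x 2 table in the usage lines is made up.

```python
import numpy as np

def pool_reversals(rel, n):
    """Single-entry smoothing: weighted pooling of cells involved in reversals."""
    blocks = []  # [pooled relative frequency, total frequency, cells pooled]
    for r, k in zip(rel, n):
        blocks.append([float(r), float(k), 1])
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            r2, k2, c2 = blocks.pop()
            r1, k1, c1 = blocks.pop()
            total = k1 + k2
            pooled = (r1 * k1 + r2 * k2) / total if total > 0 else 0.5 * (r1 + r2)
            blocks.append([pooled, total, c1 + c2])
    return np.concatenate([np.full(c, r) for r, _, c in blocks])

def method_4b(rel, n, plane_pred):
    """Order cells by a planar prediction, smooth as one long single-entry
    table, then put each smoothed value back in its original cell."""
    rel = np.asarray(rel, dtype=float)
    n = np.asarray(n, dtype=float)
    order = np.argsort(np.asarray(plane_pred, dtype=float), axis=None,
                       kind="stable")
    smoothed_flat = pool_reversals(rel.ravel()[order], n.ravel()[order])
    out = np.empty(rel.size)
    out[order] = smoothed_flat
    return out.reshape(rel.shape)

table_4b = method_4b([[0.3, 0.2], [0.2, 0.7]],
                     [[10, 80], [5, 20]],
                     [[0.1, 0.2], [0.3, 0.4]])
```

Because the ordering comes from fitted weights, two cells that no row or column comparison relates (such as ACT 25 with HSA 2.7 versus ACT 24 with HSA 3.0) are still ranked against each other.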

Insert Figure 2 about here 



Procedure 

The data for this study consisted of the records of entering freshman
students at a sample (stratified by college type) of 85 institutions that
participated in one of the ACT Research Services for the years 1969-70,
1971-72 and 1972-73. Perrin & Whitney (1976) provide a more complete
description of the institutions. (Henceforth, 1969-70 will be referred to as
the base year and 1971-72 and 1972-73 as, respectively, validation year one
and two.) For each student completing the first term in one of these schools
during the base year or either of the validation years, ACT Assessment
composite score, average of four self-reported high school grades, and first
term grade-point average (GPA) at the institution were obtained.
Construction and Smoothing Expectancy Tables

Six 5 x 5 expectancy tables were constructed for each institution using
the observed relative frequencies (one for each of the three years and at
each of the two levels of success). The two base year tables for each
school were smoothed using each of the six methods. In order to
standardize the tables, the ACT and HSA variables were divided into
five intervals, each approximately one standard deviation in width.
The five ACT composite intervals (based on a mean of approximately 20
and standard deviation of 5) were 12.5 or below, 12.5 to 17.5, 17.5 to 22.5,
22.5 to 27.5 and 27.5 or above. The five HSA intervals (based on the mean
of 2.6 and standard deviation of 0.7 reported in the ACT Basic Research
Report of 1970-71) were 1.55 or below, 1.55 to 2.25, 2.25 to 2.95, 2.95 to
3.65, and 3.65 or above.
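Assigning students to these intervals can be sketched with `numpy.digitize`. The text does not say which interval receives a score that falls exactly on a boundary (an HSA of 2.25, for instance), so placing boundary values in the lower interval is an assumption of this sketch, as are the names used.

```python
import numpy as np

ACT_EDGES = [12.5, 17.5, 22.5, 27.5]
HSA_EDGES = [1.55, 2.25, 2.95, 3.65]

def to_interval(values, edges):
    """Interval index 0..4; a value equal to an edge goes to the lower interval."""
    return np.digitize(np.asarray(values, dtype=float), edges, right=True)

act_idx = to_interval([12, 13, 20, 25, 30], ACT_EDGES)
hsa_idx = to_interval([1.0, 2.0, 2.6, 3.0, 4.0], HSA_EDGES)
```

The resulting index pairs are what a tabulation of cell frequencies and relative frequencies would consume.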
Criteria 

The two criteria reflected the degree to which each set of relative
frequencies estimated from the base year data corresponded to the relative
frequencies observed in each of the validation years. In each case, smaller
values of the index reflect more accurate estimation.

The first index weighted each of the errors in predicting the relative
frequencies in the validation year equally. This index would be expected
to identify the method(s) producing the smallest average predictive error
across all cells of the table. The first criterion measure was

\[ D_1 = \left[ \frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} \left( P_{ij} - \hat{P}_{ij} \right)^{2} \right]^{1/2} \]

where D_1 is the root mean-squared error in estimating relative
frequencies, P_ij is the observed relative frequency of success in the
cross-validation year at interval i of predictor 1 and interval j of
predictor 2, \hat{P}_ij is the relative frequency predicted by the model, m
is the number of intervals on variable 1, and n is the number of intervals
on variable 2. If no observations resulted for a cell in the table in the
validation year, then that term was not included and the denominator was
reduced by one.
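The first criterion can be sketched as below, assuming NumPy. The function name and the small arrays in the usage lines are hypothetical; cells with no validation-year observations are dropped from both the sum and the denominator, as the text specifies.

```python
import numpy as np

def d1_index(observed, predicted, val_counts):
    """Root mean-squared error over cells observed in the validation year."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    used = np.asarray(val_counts) > 0      # skip cells with no observations
    err = observed[used] - predicted[used]
    return float(np.sqrt(np.mean(err ** 2)))

# Hypothetical 2 x 2 tables; one validation-year cell is empty.
obs_demo = [[0.5, 0.6], [np.nan, 0.8]]
pred_demo = [[0.4, 0.6], [0.3, 0.9]]
n_demo = [[10, 5], [0, 2]]
d1_demo = d1_index(obs_demo, pred_demo, n_demo)
```

Every remaining cell counts equally here, whether it holds two students or two hundred.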

The second index associated greater seriousness with prediction error
for cells of the table containing a larger number of observations. This
index would be expected to identify the method(s) producing the smallest
average prediction error for the individual observations in the validation
year. The second criterion measure was

\[ D_2 = \left[ \frac{\sum_{i=1}^{m} \sum_{j=1}^{n} f_{ij} \left( P_{ij} - \hat{P}_{ij} \right)^{2}}{\sum_{i=1}^{m} \sum_{j=1}^{n} f_{ij}} \right]^{1/2} \]

where D_2 is a weighted measure of the error in estimating frequencies,
f_ij is the number of observations in the ijth cell of the validation year
table, and the other symbols are as defined above. Interpretations of the
relative validities of the methods of constructing double-entry expectancy
tables were based on the values of D_1 and D_2.
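The second criterion differs from the first only in weighting each squared error by the validation-year cell frequency. A minimal sketch, assuming NumPy; the function name and the usage data are hypothetical.

```python
import numpy as np

def d2_index(observed, predicted, val_counts):
    """Frequency-weighted root mean-squared error over validation-year cells."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    f = np.asarray(val_counts, dtype=float)
    used = f > 0                            # empty cells carry zero weight
    sq_err = (observed[used] - predicted[used]) ** 2
    return float(np.sqrt(np.sum(f[used] * sq_err) / np.sum(f[used])))

# Hypothetical 2 x 2 tables; one validation-year cell is empty.
d2_demo = d2_index([[0.5, 0.6], [np.nan, 0.8]],
                   [[0.4, 0.6], [0.3, 0.9]],
                   [[10, 5], [0, 2]])
```

A method that misses badly only in sparsely populated cells is penalized less by this index than by the equally weighted one.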




Analysis of Data

A four factor mixed analysis of variance procedure was applied to each
of the criterion measures (Myers, 1972). Factors in the analysis were size
of class (actually, number of first term students with ACT data; five
levels), expectancy table construction method (seven levels), GPA value
(two levels), and validation year (two levels). Size of class was
considered a "between" effect and all other factors were treated as
"within" effects. The unit of analysis, school within types, was considered
to be a "random" effect and all other factors were considered "fixed"
effects.

Results

The results from the analysis of variance for the criterion measures
are presented in Tables 1 and 2. All tests (including post-hoc tests) were
conducted at the .01 level of significance. Unbiased estimates of the
variance components (Myers, 1972) are also provided.[4]

Insert Tables 1 and 2 about here

The appropriate means for the main effects and interactions which
surpassed the .01 level of significance are provided in Tables 3 and 4. The
Tukey (1953) critical differences for comparisons between any pair of means
are provided in Table 5. The following discussion is based on these
pairwise comparisons. Since the interactions involving methods were
essentially ordinal, the main effects are discussed prior to the
interactions.

Insert Tables 3, 4, and 5 about here

[4] Since there was only one observation per cell, no direct estimate of
σ² was possible. It was assumed that the variance component for the highest
order pooled interaction (SMYG/Z) was equal to zero. In this way σ² was
estimated by the mean square for SMYG/Z. Also, after the variance
components were computed, the estimates that were negative were replaced by
zero. Thus, these estimates are no longer unbiased.

Main Effects

The relative frequencies estimated by any of the six smoothing methods
were more accurate than the estimates provided using the observed relative
frequencies (Method 1). In addition, Method 3a was more accurate than
Method 2a for the D1 criterion; Methods 3a, 3b, 4a and 4b were more
accurate than Methods 2a and 2b for the D2 criterion.

As would be expected, the accuracy of all methods increased as class
size increased and the methods were generally more accurate for validation
year 1 than for validation year 2. In addition, all methods were more
accurate for the "B or better" level than for the "C or better" level.
Interaction of Methods and Class Size

The general tendency for accuracy to increase for all methods as
class size increased held for both criterion measures. For the D1
criterion, all of the smoothing methods were more accurate than was Method 1
across all class size levels. No substantial differences among the
remaining methods were noted at any of the class size levels.

For the two smallest class size intervals, the relative frequencies
estimated by any of the smoothing methods were more accurate than the
estimates produced by Method 1 using the D2 criterion measure. However,
as class size increased, the relative accuracy of Method 2a (and, to some
extent, Method 2b) decreased with respect to the remaining methods,
including Method 1. No substantial differences among the means of Methods
3a, 3b, 4a, and 4b were observed at any of the class size levels. Based on
these results, Methods 3a, 3b, 4a, and 4b provided more accurate estimated
relative frequencies at a wider range of class size levels than did the
remaining methods.

Interactions of Method with GPA Level and/or Validation Year

Based on the D1 index, the smoothing methods resulted in more accurate
estimation of validation year relative frequencies than did Method 1 at all
GPA level and validation year combinations. The accuracy of the smoothing
methods did not differ substantially for the tables computed using the
"B or better" level of success. Method 3a was, however, superior to the
other methods for the validation year 1 and "C or better" GPA combination.
For the validation year 2 and "C or better" GPA combination, Method 3a was
the most accurate while Methods 2a and 2b tended to be the least accurate
of all the methods except Method 1.

The smoothing methods were also more accurate than Method 1 at both
GPA levels according to the D2 criterion measure. The accuracy of the
smoothing methods did not differ substantially at the "B or better" GPA
level. Methods 3a, 3b, 4a, and 4b were more accurate than Methods 2a and
2b at the "C or better" level. Method 3a was at least as accurate
as the other methods for all combinations of class size, GPA, and
validation year levels.

Accuracy Gained by Using Smoothing Methods

In order to reflect the degree to which each smoothing method improved
on the predictions from observed relative frequencies (Method 1), the
average percentage gain in accuracy by using each of the smoothing methods
is presented in Table 6. These values indicate that, overall, the use of
each smoothing method resulted in a gain in predictive accuracy of about 25%
for the D1 index. The gains in predictive accuracy based on the D2 index
were about 20% for Methods 3a, 3b, 4a, and 4b but only 10% for Methods 2a
and 2b. Similar values were computed for the index (comparable to our D1
index) used by Perrin and Whitney (1976); their smoothing methods for
single-entry tables resulted in a gain in predictive accuracy of from 25%
to 32%.

Insert Table 6 about here 



Discussion 

The use of smoothing methods resulted in a practically significant
increase in predictive accuracy in both this study and that of Perrin and
Whitney (1976). In the present study, the relative size of the estimated
variance components for methods and for the interaction of methods with the
other variables suggests that methods contributed substantially to the total
variance of the model. The present study and that of Perrin (1974) suggest
that Methods 2a and 2b result in a substantial increase in average
predictive accuracy across cells (as reflected by the D1 index). However,
this increase in accuracy was surpassed by some of the other methods. These
studies also suggest that Methods 2a and 2b result in only a minimal
increase in predictive accuracy for individual observations (as reflected by
the D2 index). Both of these types of accuracy are desirable in most
educational situations. Since Methods 2a and 2b did not provide a
substantial gain in the latter type of accuracy, these commonly used
regression methods (Schraeder, 1965) are less appropriate than are other
methods.

Of the construction methods studied, Method 3a (linear regression on
observed relative frequencies) was at least as accurate as any of the other
methods for both criterion measures at all combinations of GPA, class size,
and validation year levels. Method 3a would be generally preferred for
double-entry tables like those studied.

The gain in predictive accuracy from using either extension of the
isotonic regression method was nearly as great (overall and for most GPA,
class size, and validation year levels) as was that from using Method 3a.
Because Method 4b required the computation of the regression estimates used
in Method 3a and subsequently employed the isotonic regression method for
single-entry tables, Method 4b is relatively complex. So even though the
use of Method 4b produced estimates nearly as accurate as those produced by
Method 3a, the complexity of this method does not suggest its use. Because
the computations involved in using Method 4a are relatively simple (in fact,
clerical personnel could easily use this method aided only by a pocket
calculator), this method would be preferred when access to a computer is not
available. Method 4a would also be preferred when the more stringent
statistical assumptions of Method 3a are not likely to be met.

Any of the smoothing methods studied remove logical contradictions
between observed data and beliefs regarding the actual relationship among
predictor and success measures. Smoothing would also be expected to increase
the accuracy of prediction when constructing tables to be used for
admission, guidance, or planning purposes. If such methods are used,
however, the traditional regression methods (Methods 2a and 2b) cannot be
expected to be as accurate as the other methods studied. Methods 3a and 4a
are especially recommended for this purpose.



References

Anastasi, A. Psychological Testing. 4th Ed. New York: Macmillan
Publishing Company, Inc., 1976.

Ayer, M., Brunk, H. D., Ewing, G. M., Reid, W. T., & Silverman, E. An
empirical distribution function for sampling with incomplete information.
Annals of Mathematical Statistics, 1955, 26, 641-647.

Myers, J. L. Fundamentals of experimental design. Boston: Allyn and
Bacon, Inc., 1972.

Perrin, D. W. A comparison of methods for smoothing expectancy tables
as applied to the prediction of success in college (Doctoral dissertation,
University of Iowa, 1974). Ann Arbor, Michigan: University Microfilms,
1974, No. TS-'iaSOS.

Perrin, D. W., & Whitney, D. R. Methods for smoothing expectancy tables
applied to the prediction of success in college. Journal of Educational
Measurement, 1976, 13, 223-231.

Schraeder, W. B. A taxonomy of expectancy tables. Journal of Educational
Measurement, 1965, 2, 29-35.

Tukey, J. W. The problems of multiple comparisons. Ditto, Princeton
University, 1953, pp. 396.

Wesman, A. Double-entry expectancy tables. Test Service Bulletin of the
Psychological Corporation, 1966.



[Figure 1. Illustration of smoothing Method 4a. Each panel is a 3 x 4
table with Predictor 1 (Low to High) across the columns and Predictor 2
(High to Low) down the rows. Panels: Figure 1a, unsmoothed data constructed
by Method 1; Figure 1b, rows adjusted; Figure 1c, rows adjusted first and
then columns; Figure 1d, columns adjusted; Figure 1e, columns adjusted
first and then rows; Figure 1f, averaging of two double-adjusted tables.]



TABLE 1

ANOVA SUMMARY TABLE FOR D1 INDEX

                                                               Mean       Estimated
                                                    Mean       Square     Variance
Source                                        df    Square     Ratio      Components

Between Schools                               84
  Z (Class Size)                               4     0.57151     17.13      .000904
  S(School)/Z(Size)                           80     0.03336                .001010

Within Schools                              2295
  M (Method)                                   6     0.28053    111.75*     .000701
  ZM (Size X Method)                          24     0.01327      5.28      .000108
  SM/Z (School X Method/Size)                480     0.00251                .000376

  Y (Validation Year)                          1     0.16749      9.88      .000063
  ZY (Size X Year)                             4     0.00678      0.40      .000000
  SY/Z (School X Year/Size)                   80     0.01695                .000548

  G (GPA Level)                                1     0.64152     24.77      .000259
  ZG (Size X GPA Level)                        4     0.02227      0.86      .000000
  SG/Z (School X GPA Level/Size)              80     0.02590                .000849

  MY (Method X Year)                           6     0.00040      0.46      .000000
  ZMY (Size X Method X Year)                  24     0.00064      0.74      .000000
  SMY/Z (School X Method X Year/Size)        480     0.00087                .000046

  MG (Method X GPA Level)                      6     0.04146     23.99      .000100
  ZMG (Size X Method X GPA Level)             24     0.00187      1.08      .000001
  SMG/Z (School X Method X GPA Level/Size)   480     0.00173                .000218

  YG (Year X GPA Level)                        1     0.00512      0.32      .000000
  ZYG (Size X Year X GPA Level)                4     0.03864      2.39      .000038
  SYG/Z (School X Year X GPA Level/Size)      80     0.01614                .000521

  MYG (Method X Year X GPA Level)              6     0.00242      3.75      .000005
  ZMYG (Size X Method X Year X GPA Level)     24     0.00030      0.46      .000000
  SMYG/Z (School X Method X Year X
          GPA Level/Size)                    480     0.00065                .000000

Error                                                                       .000645
Total                                       2379                            .006371

*p < .01
**Is zero by assumption
***Negative estimates replaced by zero
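
The mean square ratios in Table 1 can be checked by dividing each effect's mean square by the mean square of a school-within-size term. The sketch below does this for the first few rows. Pairing each effect with that particular error term is an assumption inferred from the layout of the table, not something stated in this section, and the dictionary names are illustrative only.

```python
# Mean squares copied from Table 1.
mean_squares = {
    "Z": 0.57151, "S/Z": 0.03336,
    "M": 0.28053, "ZM": 0.01327, "SM/Z": 0.00251,
    "Y": 0.16749, "ZY": 0.00678, "SY/Z": 0.01695,
    "G": 0.64152, "ZG": 0.02227, "SG/Z": 0.02590,
}

# Assumed error term for each effect: the school-within-size term that
# involves the same within-school factor (S/Z for the class-size effect).
error_term = {
    "Z": "S/Z",
    "M": "SM/Z", "ZM": "SM/Z",
    "Y": "SY/Z", "ZY": "SY/Z",
    "G": "SG/Z", "ZG": "SG/Z",
}

# Ratio = effect mean square / assumed error mean square.
ratios = {effect: mean_squares[effect] / mean_squares[err]
          for effect, err in error_term.items()}

for effect, ratio in ratios.items():
    print(f"{effect:>3}  {ratio:7.2f}")
```

Under this pairing the computed ratios agree with the tabled ones to within the rounding of the printed mean squares (for example, 0.28053 / 0.00251 is about 111.8 against the tabled 111.75).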






TABLE 2

ANOVA SUMMARY TABLE FOR D2 INDEX

                                                               Mean       Estimated
                                                    Mean       Square     Variance
Source                                        df    Square     Ratio      Components

Between Schools                               84
  Z (Class Size)                               4    [illeg.]     15.85      .000706
  S(School)/Z(Size)                           80     0.02830                .000946

Within Schools                              2295
  M (Method)                                   6     0.05326     43.75*     .000131
  ZM (Size X Method)                          24    [illeg.]      7.28      .000077
  SM/Z (School X Method/Size)                480    [illeg.]                .000213

  Y (Validation Year)                          1     0.07604      9.39      .000029
  ZY (Size X Year)                             4    [illeg.]      0.36      .000000
  SY/Z (School X Year/Size)                   80    [illeg.]                .000267

  G (GPA Level)                                1     0.12149     11.15      .000046
  ZG (Size X GPA Level)                        4    [illeg.]      0.17      .000000
  SG/Z (School X GPA Level/Size)              80    [illeg.]                .000360

  MY (Method X Year)                           6     0.00018      0.89      .000000
  ZMY (Size X Method X Year)                  24     0.00011      0.54      .000000
  SMY/Z (School X Method X Year/Size)        480     0.00020                .000007

  MG (Method X GPA Level)                      6     0.01052     11.35      .000024
  ZMG (Size X Method X GPA Level)             24     0.00122      1.32      .000003
  SMG/Z (School X Method X GPA Level/Size)   480     0.00093                .000155

  YG (Year X GPA Level)                        1     0.01550      5.18*     .000005
  ZYG (Size X Year X GPA Level)                4     0.01142      3.81      .000614
  SYG/Z (School X Year X GPA Level/Size)      80     0.00230                .000072

  MYG (Method X Year X GPA Level)              6     0.00026      1.58      .000000
  ZMYG (Size X Method X Year X GPA Level)     24     0.00017      1.06      .000000
  SMYG/Z (School X Method X Year X
          GPA Level/Size)                    480     0.00016                .000000

Error                                                                       .000162
Total                                       2379                            .003241
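
The degrees of freedom in the two ANOVA summary tables follow from the layout of the study: 85 institutions (the count given in the abstract) nested in five class-size groups, crossed with seven methods, two validation years, and two GPA levels. The sketch below rebuilds the df column from those counts. It assumes one observation per school-by-method-by-year-by-level cell, and the variable names are illustrative.

```python
from itertools import combinations
from math import prod

schools, sizes = 85, 5
within_levels = {"M": 7, "Y": 2, "G": 2}  # methods, validation years, GPA levels

# Between-school sources: class size, and schools nested within class size.
df = {"Z": sizes - 1, "S/Z": schools - sizes}

# Within-school sources: every main effect and interaction of M, Y, G,
# each crossed with Z and with S/Z.
for r in range(1, len(within_levels) + 1):
    for combo in combinations(within_levels, r):
        name = "".join(combo)
        effect_df = prod(within_levels[f] - 1 for f in combo)
        df[name] = effect_df
        df["Z" + name] = df["Z"] * effect_df
        df["S" + name + "/Z"] = df["S/Z"] * effect_df

between = df["Z"] + df["S/Z"]
within = sum(v for k, v in df.items() if k not in ("Z", "S/Z"))
total = between + within
print(between, within, total)  # 84 2295 2379
```

The totals match the Between Schools, Within Schools, and Total rows, and 2379 is one less than the 85 x 7 x 2 x 2 = 2380 cell values.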



[Figure 2: three panels. The cell frequencies are not fully recoverable from the scan; the proportions below are listed in the order printed.]

Figure 2a. Ordering of cells, with regression weights b1 = .0367, b2 = .1893, and [remainder of caption illegible].
    Proportions: .100 .300 .100 .400 .200 .600 .600 .200 .500 .000 .700 .800

Figure 2b. Smoothed values.
    Proportions: .100 .200 .200 .240 .240 .509 .509 .509 .509 .509 .700 .800

Figure 2c. Reconstructed, smoothed double-entry table (Predictor 1, Low to High, across the columns; Predictor 2, High to Low, down the rows).

Figure 2. Illustration of smoothing method 4b.
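
The runs of equal values in the Figure 2b proportions are what a weighted pool-adjacent-violators pass over the ordered cells would produce. The sketch below is a minimal version of that pass, offered as an illustration only: this section does not confirm that method 4b is implemented this way. The proportions are the first five listed for Figure 2a (.100, .300, .100, .400, .200). The cell frequencies are hypothetical, chosen so that the pooled results match the first five proportions of Figure 2b; they were not read from the scan.

```python
def pool_adjacent_violators(values, weights):
    """Weighted nondecreasing fit: merge neighbouring cells that are out of order."""
    blocks = []  # each block: [weighted mean, total weight, number of cells]
    for value, weight in zip(values, weights):
        blocks.append([value, weight, 1])
        # Merge backwards while the newest block is lower than the one before it.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            mean2, wt2, count2 = blocks.pop()
            mean1, wt1, count1 = blocks.pop()
            pooled_wt = wt1 + wt2
            pooled_mean = (mean1 * wt1 + mean2 * wt2) / pooled_wt
            blocks.append([pooled_mean, pooled_wt, count1 + count2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)  # every cell in a block gets the pooled value
    return fitted


# Proportions from the Figure 2a listing; frequencies are hypothetical.
proportions = [0.100, 0.300, 0.100, 0.400, 0.200]
frequencies = [80, 5, 5, 10, 40]

smoothed = pool_adjacent_violators(proportions, frequencies)
print([round(p, 3) for p in smoothed])
```

With these inputs the second and third cells pool to .200 and the fourth and fifth to .240, which is the pattern shown at the start of Figure 2b.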






TABLE 3

MEANS ON CRITERION D1

                            GPA Level
                 B or Better         C or Better                          Class Size
Smoothing      Validation Year     Validation Year
Method         Year 1    Year 2    Year 1    Year 2    98-159   160-219   220-339  340-1149 1150-4193       All

1               .2452     .2712     .3358     .3360     .3110     .3558     .2995  [illeg.]  [illeg.]     .2971
2a              .2000     .2187     .2404     .2330     .2323     .2557     .2397  [illeg.]     .1833     .2280
2b              .196?     .2204     .2340     .2493     .2461     .2559     .2345     .2130     .1759     .2251
3a              .2020     .2168     .2084     .2281     .2279     .2375     .2306     .2053     .1678     .2138
3b              .2062     .2218     .2200     .2408     .2424     .2487     .2327     .2071     .1801     .2222
4a              .19?1     .2202     .2294     .2426     .2457     .2503     .2366  [illeg.]     .1707     .2223
4b              .1978     .2137     .2276     .2424     .2438     .2480     .2386     .2061     .1634     .2204

All Methods     .2064     .2261     .2422     .2160     .2559     .2646     .2446     .2200     .1784








TABLE 4

MEANS* ON CRITERION D2

                    GPA Level                            Class Size
Smoothing        B or      C or
Method         Better    Better    98-159   160-219   220-339  340-1149 1150-4193       All

1               .1655     .1937     .2284     .2240     .1897     .1512     .1041     .1796
2a              .1495     .1725     .1689     .1824     .1721     .1506     .1311     .1610
2b              .1496     .1739     .1780     .1842     .1725     .1466     .1274     .1617
3a              .1442     .1430     .1604     .1657     .1570     .1317     .1032     .1436
3b              .1462     .1500     .1685     .1729     .1567     .1284     .1139     .1481
4a              .1442     .1548     .1761     .1785     .1611     .1332     .0986     .1495
4b              .1422     .1535     .1767     .1749     .1609     .1307     .0961     .1479

All Methods     .1488     .1631     .1796     .1832     .1671     .1390     .1106

*In addition, the overall means for validation years 1 and 2 were .1503
and .1616, respectively.






TABLE 5

TUKEY CRITICAL DIFFERENCES FOR COMPARISONS
OF PAIRS OF MEANS CONTAINED IN
SIGNIFICANT MAIN EFFECTS OR INTERACTIONS

              Criterion Measure
Source          D1        D2

Z             .0398     .0367
M             .0133     .0093
ZM            .0370     .0258
MG            .0174     .0127
ZYG                     .0251
MYG           .0163









TABLE 6

MEAN PERCENTAGE GAIN IN PREDICTIVE ACCURACY
USING SMOOTHING METHODS COMPARED TO THAT OF
METHOD 1

              Criterion Index
Method          D1        D2

2a              23        10
2b              24        10
3a              28        20
3b              25        18
4a              25        17
4b              26        18
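
The entries in Table 6 can be reproduced from the "All" column means of Tables 3 and 4. The sketch below assumes that the gain is the percentage reduction in the overall criterion mean relative to Method 1, with a smaller criterion value meaning a more accurate table. The report may instead have averaged school-level percentages; this section does not say. The function and variable names are illustrative.

```python
# "All" column means copied from Table 3 (first criterion) and Table 4 (second).
means_first = {"1": 0.2971, "2a": 0.2280, "2b": 0.2251, "3a": 0.2138,
               "3b": 0.2222, "4a": 0.2223, "4b": 0.2204}
means_second = {"1": 0.1796, "2a": 0.1610, "2b": 0.1617, "3a": 0.1436,
                "3b": 0.1481, "4a": 0.1495, "4b": 0.1479}


def percent_gain(means, baseline="1"):
    """Percentage reduction in the criterion mean relative to the baseline method."""
    base = means[baseline]
    return {method: round(100 * (base - value) / base)
            for method, value in means.items() if method != baseline}


gain_first = percent_gain(means_first)
gain_second = percent_gain(means_second)

for method in gain_first:
    print(f"{method:<3} {gain_first[method]:>3} {gain_second[method]:>3}")
```

Under that assumption the rounded percentages come out the same as the two columns of Table 6.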






